Mirror of linear changes #2

Open
kwyss-nvidia wants to merge 56 commits into kwyss/cublas_gemm_github_mr from kwyss/subchannel_recipe_linear

@kwyss-nvidia
Owner

A more reviewable mirror of the changes from NVIDIA#1559

kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch 2 times, most recently from 07d55ea to 8bb7d63 on March 12, 2025 21:39
kwyss-nvidia force-pushed the kwyss/cublas_gemm_github_mr branch from c3eebe7 to b848509 on March 12, 2025 21:42
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch from 8bb7d63 to 365a4d9 on March 13, 2025 00:06
kwyss-nvidia force-pushed the kwyss/cublas_gemm_github_mr branch from b848509 to 1058efc on March 13, 2025 00:07
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch 3 times, most recently from 6c70366 to 08aa4de on March 15, 2025 00:22
kwyss-nvidia force-pushed the kwyss/cublas_gemm_github_mr branch 2 times, most recently from eee37bf to ce4ca80 on March 17, 2025 17:24
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch 2 times, most recently from 51fbe41 to 78c194d on March 17, 2025 17:33
kwyss-nvidia force-pushed the kwyss/cublas_gemm_github_mr branch from ce4ca80 to 5ebc93a on March 19, 2025 22:42
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch from 78c194d to 8f4f0f0 on March 19, 2025 22:43
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch 2 times, most recently from 1d112ac to 48648a9 on April 1, 2025 19:43
kwyss-nvidia force-pushed the kwyss/cublas_gemm_github_mr branch from 5aa279e to 8466c36 on April 1, 2025 19:45
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch from ca005ab to e35f2b6 on April 1, 2025 21:46
kwyss-nvidia force-pushed the kwyss/cublas_gemm_github_mr branch from 8466c36 to e788ca2 on April 1, 2025 21:48
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch 2 times, most recently from 22828fe to 413331d on April 1, 2025 23:23
kwyss-nvidia force-pushed the kwyss/cublas_gemm_github_mr branch from e788ca2 to 9ac89ea on April 1, 2025 23:23
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch 4 times, most recently from db5b49e to 8d59b0a on April 2, 2025 18:52
kwyss-nvidia force-pushed the kwyss/cublas_gemm_github_mr branch from 9ac89ea to fa019d5 on April 2, 2025 18:53
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch from 8d59b0a to 3424dc7 on April 2, 2025 19:19
kwyss-nvidia force-pushed the kwyss/cublas_gemm_github_mr branch from fa019d5 to cd3e414 on April 2, 2025 19:20
kwyss-nvidia and others added 14 commits April 7, 2025 16:50
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
* Use dummy wgrads for lower memory consumption

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>

* Bug fix to avoid sharing gradients.

Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>

* Disable automatic use of batch_p2p_comm for CP2

Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>

* Change weight to origin_weight for LN_LINEAR

Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>

---------

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: zhongboz <zhongboz@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
* Minor stylistic tweaks and typo fixes

Review suggestions from @ptrendx

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix incorrect col strides for MXFP8 matrices

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch from d7775fc to b62d555 on April 8, 2025 23:35
kwyss-nvidia and others added 2 commits April 8, 2025 16:39
Apply MR comment change.

Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: kwyss-nvidia <kwyss@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch from 8fc753d to 67e790b on April 9, 2025 00:05
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
kwyss-nvidia force-pushed the kwyss/subchannel_recipe_linear branch from 6948759 to ea9e46b on April 9, 2025 01:32
kwyss-nvidia and others added 10 commits April 8, 2025 18:49
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
* scaling enum abstract

* rm NVTE_ from ScalingMode names

* rework scaling mode enum in grouped gemm

* fix norm sharding

---------

Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
…r op backward (NVIDIA#1646)

Explicitly specify quantized tensor usages needed for linear op backward

Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Debug checkpointing with te.Sequential

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Keith Wyss <kwyss@nvidia.com>
Signed-off-by: Xin Yao <yaox12@outlook.com>
Signed-off-by: Xin Yao <yaox12@outlook.com>

7 participants